A Case Study: Effects of With-Loop-Folding on the NAS Benchmark MG
Abstract
Sac is a functional C variant with efficient support for high-level array operations. This paper investigates the applicability of a Sac-specific optimization technique called with-loop-folding to real-world applications. As an example program which originates from the Numerical Aerodynamic Simulation (NAS) Program developed at NASA Ames Research Center, the so-called NAS benchmark MG is chosen. It comprises a kernel from the NAS Program which implements 3-dimensional multigrid relaxation. Several run-time measurements exploit two different benefits of with-loop-folding: First, an overall speed-up of about 20% can be observed. Second, a comparison between the run-times of a hand-optimized specification and of Apl-like specifications yields identical run-times, although a naive compilation that does not apply with-loop-folding leads to slowdowns of more than an order of magnitude. Furthermore, with-loop-folding makes a slight variation of the algorithm feasible which substantially simplifies the program specification and requires less memory during execution. Finally, the optimized run-times are compared against run-times gained from the original Fortran program, which shows that for different problem sizes, the code generated from the Sac program does not only reach the execution times of the code generated from the Fortran program but even outperforms them by about 10%.